Posted 2025-03-18Updated 2026-03-30Notea few seconds read (About 83 words) visits

(UVtransE) Contextual Translation Embedding for Visual Relationship Detection and Scene Graph Generation

The **Union Visual Translation Embedding network (UVTransE)**, which learns three projection matrices $W_{s}$, $W_{o}$, $W_{u}$ which map the respective feature vectors of the bounding boxes enclosing the subject, object, and union of subject and object into a common embedding space, as well as translation vectors $t_{p}$ (to be consistent with [[(VtransE) Visual Translation Embedding Network for Visual Relation Detection]]) in the same space corresponding to each of the predicate labels that are present in the dataset.

http://chen-yulin.github.io/2025/03/18/[OBS]Reconstruct Anything-Relation-(UVtransE) Contextual Translation Embedding for Visual Relationship Detection and Scene Graph Generation/

Author

Chen Yulin

Posted on

2025-03-18

Updated on

2026-03-30

Licensed under

(UVtransE) Contextual Translation Embedding for Visual Relationship Detection and Scene Graph Generation

Author

Posted on

Updated on

Licensed under

Comments

Archives

Recents

Tags